Improving ascertainment of suicidal ideation and suicide attempt with natural language processing

Methods that rely on diagnostic codes to identify suicidal ideation and suicide attempt in electronic health records (EHRs) at scale are suboptimal because suicide-related outcomes are heavily under-coded. We propose to improve the ascertainment of suicidal outcomes using natural language processing (NLP). We developed information retrieval methodologies to search over 200 million notes from the Vanderbilt EHR. Suicide query terms were extracted using word2vec. A weakly supervised approach was designed to label cases of suicidal outcomes. The NLP validation of the top 200 retrieved patients showed high performance for suicidal ideation (area under the receiver operating characteristic curve [AUROC]: 98.6, 95% confidence interval [CI] 97.1–99.5) and suicide attempt (AUROC: 97.3, 95% CI 95.2–98.7). Case extraction performed best when NLP was combined with diagnostic codes and when negated suicide expressions in notes were accounted for. Overall, we demonstrated that scalable and accurate NLP methods can be developed to identify suicidal behavior in EHRs to enhance prevention efforts, predictive models, and precision medicine.
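To make the role of negation concrete, the following is a minimal sketch of negation-aware term matching combined with a diagnostic-code flag. It is not the authors' system: the query terms, the negation cues, the five-token window, and the rule "code OR at least one affirmed mention" are all illustrative assumptions, as are the function names `count_mentions` and `is_candidate_case`.

```python
import re

# Hypothetical query terms and negation cues (assumptions, not the paper's lists).
QUERY_TERMS = ["suicidal ideation", "suicide attempt"]
NEGATION_CUES = ["denies", "no", "without", "negative for"]


def count_mentions(note, window=5):
    """Return (affirmed, negated) counts of query-term mentions in one note."""
    text = note.lower()
    affirmed = negated = 0
    for term in QUERY_TERMS:
        for match in re.finditer(re.escape(term), text):
            # Restrict the context to the current sentence or line.
            start = max(text.rfind(".", 0, match.start()),
                        text.rfind("\n", 0, match.start())) + 1
            # Keep at most `window` word tokens that precede the term.
            preceding = re.findall(r"[a-z]+", text[start:match.start()])[-window:]
            context = " ".join(preceding)
            if any(re.search(r"\b" + re.escape(cue) + r"\b", context)
                   for cue in NEGATION_CUES):
                negated += 1
            else:
                affirmed += 1
    return affirmed, negated


def is_candidate_case(notes, has_diagnostic_code):
    """Assumed combination rule: a diagnostic code OR any affirmed mention."""
    affirmed = sum(count_mentions(note)[0] for note in notes)
    return has_diagnostic_code or affirmed > 0
```

Under these assumptions, a note reading "Patient denies suicidal ideation." contributes only a negated mention, so a patient with that note alone and no diagnostic code would not be flagged.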


Supplemental Tables

Table S1
ICD9CM codes for suicidal ideation.

Table S5
Top 50 keywords generated by word2vec as semantically similar to 'suicide' and 'suicidal'.
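As an illustration of how a keyword list of this kind can be produced, the sketch below ranks vocabulary terms by cosine similarity to a seed word. The three-dimensional vectors are made-up stand-ins for trained word2vec embeddings, and the vocabulary and the helper names `cosine` and `most_similar` are hypothetical; only the ranking idea is meant to carry over.

```python
import math

# Toy vectors standing in for trained word2vec embeddings (values are invented).
EMBEDDINGS = {
    "suicide": [0.9, 0.1, 0.0],
    "suicidal": [0.8, 0.2, 0.1],
    "overdose": [0.7, 0.3, 0.2],
    "fracture": [0.0, 0.9, 0.4],
    "appointment": [0.1, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(seed, k):
    """Return the k vocabulary terms closest to `seed`, most similar first."""
    scores = [(word, cosine(EMBEDDINGS[seed], vec))
              for word, vec in EMBEDDINGS.items() if word != seed]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]
```

With these toy vectors, `most_similar("suicide", 2)` returns "suicidal" followed by "overdose"; with real embeddings, `k` would be set to 50 to obtain a list of the size described here.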

Table S6
Query terms used to retrieve suicidal ideation and suicide attempt.

Table S7
Comparative analysis for extracting the top K highest ranked suicidal ideation and suicide attempt patients using various configurations of the suicide label assignment method.
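One plausible way to compare configurations on their top K ranked patients is precision at K, sketched below. The source does not state here which metric Table S7 reports, so this metric choice is an assumption; the patient identifiers, scores, and the function name `precision_at_k` are hypothetical.

```python
def precision_at_k(scored_patients, true_cases, k):
    """Fraction of the k highest-scored patients that are confirmed cases.

    scored_patients: list of (patient_id, score) pairs from one configuration.
    true_cases: set of patient_ids confirmed as cases (e.g. by chart review).
    """
    ranked = sorted(scored_patients, key=lambda pair: pair[1], reverse=True)
    top = [patient_id for patient_id, _ in ranked[:k]]
    if not top:
        return 0.0
    return sum(patient_id in true_cases for patient_id in top) / len(top)
```

Running the same function over the ranked output of each label-assignment configuration, at several values of K, would give a side-by-side comparison of the kind the caption describes.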

Table S8
Suicidal ideation and suicide attempt cases extracted from the EHR.

Asphyxiation due to plastic bag, intentional self-harm
T71.124*  Asphyxiation due to plastic bag, undetermined
T71.132*  Asphyxiation due to being trapped in bed linens, intentional self-harm
T71.134*  Asphyxiation due to being trapped in bed linens, undetermined
T71.144*  Asphyxiation due to smothering under another person's body (in bed), undetermined
T71.152*  Asphyxiation due to smothering in furniture, intentional self-harm
T71.154*  Asphyxiation due to smothering in furniture, undetermined
T71.162*  Asphyxiation due to hanging, intentional self-harm
T71.164*  Asphyxiation due to hanging, undetermined
T71.192*  Asphyxiation due to mechanical threat to breathing due to other causes, intentional self-harm
T71.194*  Asphyxiation due to mechanical threat to breathing due to other causes, undetermined
T71.222*  Asphyxiation due to being trapped in a car trunk, intentional self-harm
T71.224*  Asphyxiation due to being trapped in a car trunk, undetermined
T71.232*  Asphyxiation due to being trapped in a (discarded) refrigerator, intentional self-harm
T71.234*  Asphyxiation due to being trapped in a (discarded) refrigerator, undetermined
X71*  Intentional self-harm by drowning and submersion
X72*  Intentional self-harm by handgun discharge
X73*  Intentional self-harm by rifle, shotgun and larger firearm discharge
X74*  Intentional self-harm by other and unspecified firearm and gun discharge
X75*  Intentional self-harm by explosive material
X76*  Intentional self-harm by smoke, fire and flames
X77*  Intentional self-harm by steam, hot vapors and hot objects
X78*  Intentional self-harm by sharp object
X79*  Intentional self-harm by blunt object
X80*  Intentional self-harm by jumping from a high place
X81*  Intentional self-harm by jumping or lying in front of moving object
X82*  Intentional self-harm by crashing of motor vehicle
X83*  Intentional self-harm by other specified means
Y21*  Drowning and submersion, undetermined intent
Y22*  Handgun discharge, undetermined intent
Y23*  Rifle, shotgun and larger firearm discharge, undetermined intent
Y24*  Other and unspecified firearm discharge, undetermined intent
Y25*  Contact with explosive material, undetermined intent
Y26*  Exposure to smoke, fire and flames, undetermined intent
Y27*  Contact with steam, hot vapors and hot objects, undetermined intent
Y28*  Contact with sharp object, undetermined intent
Y29*  Contact with blunt object, undetermined intent
Y30*  Falling, jumping or pushed from a high place, undetermined intent
Y31*  Falling, lying or running before or into moving object, undetermined intent
Y32*  Crashing of motor vehicle, undetermined intent
Y33*  Other specified events, undetermined intent
Z91.5  Personal history of self-harm

Table S8
Suicidal ideation (SI) and suicide attempt (SA) cases extracted from: 1) the top ranked patients by the NLP system, 2) patients manually labeled as cases, 3) patients with positive assertions for suicidal ideation and suicide attempt in their psychiatric forms, and 4) patients with ICD10CM codes for self-injurious thoughts and behaviors.
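The trailing asterisk on most of the ICD10CM codes listed above reads naturally as a wildcard covering every more specific code under that prefix. The sketch below shows one way such patterns could be applied to a patient's recorded codes. The wildcard interpretation, the exact-match treatment of codes without an asterisk (such as Z91.5), and the names `PATTERNS`, `matches`, and `has_listed_code` are assumptions made for illustration; only a few patterns are copied from the list.

```python
# A few patterns copied from the code list; "*" is assumed to mean "any suffix".
PATTERNS = ["T71.162*", "X78*", "Y21*", "Z91.5"]


def matches(code, pattern):
    """True if a recorded ICD10CM code satisfies one listed pattern."""
    code = code.upper().strip()
    if pattern.endswith("*"):
        # Wildcard pattern: compare against the prefix before the asterisk.
        return code.startswith(pattern[:-1])
    # No asterisk: require an exact match (an assumption about the notation).
    return code == pattern


def has_listed_code(patient_codes):
    """True if any of the patient's codes matches any listed pattern."""
    return any(matches(code, pattern)
               for code in patient_codes
               for pattern in PATTERNS)
```

For example, a recorded code of `X78.1XXA` would match the pattern `X78*`, while `Z91.51` would not match `Z91.5` under the exact-match assumption.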